Welcome to the 10x Genomics SIB Days 2020 - virtual conference Visium Spatial Transcriptomics workshop!
The purpose of this tutorial is to walk users through some of the steps necessary to explore data produced by the 10x Genomics Visium Spatial Gene Expression Solution and the Spaceranger pipeline. All datasets that we will investigate today are freely available from 10x Genomics.
Please note that this tutorial is largely an extension of the primary Seurat Visium tutorial.
Things to know about this workshop
/mnt/libs/shared_data/[]
library(Seurat)
library(ggplot2)
library(patchwork)
library(dplyr)
library(RColorBrewer)
If you have problems loading these packages, we’ll set the library path to make sure we are all using the same set of pre-installed libraries.
.libPaths(new = "/usr/lib64/R/library")
Now we’ll load the dataset that will be used from this point forward, using the Seurat::Load10X_Spatial function.
Real Dataset for the tutorial
breast_cancer <- Load10X_Spatial(data.dir = "/mnt/libs/shared_data/human_breast_cancer_1/outs/",
filename = "V1_Breast_Cancer_Block_A_Section_1_filtered_feature_bc_matrix.h5")
Note that the Default Assay is set to “Spatial”
DefaultAssay(breast_cancer)
[1] "Spatial"
It’s good to note that there are a bunch of Visium data sets hosted by the Satija lab in the Seurat Data Package.
It’s very easy to add metadata to your Seurat object with any values you want to check out on top of the defaults.
mito.gene.names <- grep("^mt-", rownames(breast_cancer@assays$Spatial), value = TRUE, ignore.case = TRUE)
col.total <- Matrix::colSums(breast_cancer@assays$Spatial)
breast_cancer <- AddMetaData(breast_cancer, Matrix::colSums(breast_cancer@assays$Spatial[mito.gene.names, ]) / col.total, "pct.mito")
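As a sanity check on the calculation above, here is a minimal base R sketch on a made-up count matrix (the spot names and counts are invented for illustration). It follows the same steps: find the mitochondrial genes by name, sum their counts per spot, and divide by the total counts per spot. Note that the result is a fraction between 0 and 1, so despite the pct.mito name it would need multiplying by 100 to be read as a percentage.

```r
# Toy genes-by-spots count matrix (values invented for illustration)
counts <- matrix(
  c(5, 0, 3,
    1, 2, 0,
    4, 8, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("MT-ND1", "MT-CO1", "ERBB2"),
                  c("spot1", "spot2", "spot3"))
)

# Mitochondrial genes are picked up by their "MT-" prefix (case-insensitive)
mito_genes <- grep("^mt-", rownames(counts), value = TRUE, ignore.case = TRUE)

# Fraction of each spot's UMIs that come from mitochondrial genes
frac_mito <- colSums(counts[mito_genes, , drop = FALSE]) / colSums(counts)
print(frac_mito)
```

The drop = FALSE keeps the subset as a matrix even if only one mitochondrial gene matches, so colSums still works.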
Let’s have a look at some basic QC information. Keep in mind that most Seurat plots are ggplot objects and can be manipulated as such.
Counts = UMI
Features = Genes
plot1 <- VlnPlot(breast_cancer, features = "nCount_Spatial", pt.size = 0.1) +
ggtitle("UMI") +
theme(axis.text.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "right") +
NoLegend()
plot2 <- VlnPlot(breast_cancer, features = "nFeature_Spatial", pt.size = 0.1) +
ggtitle("Genes") +
theme(axis.text.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "right") +
NoLegend()
plot3 <- VlnPlot(breast_cancer, features = "pct.mito", pt.size = 0.1) +
ggtitle("Percentage Mito genes") +
theme(axis.text.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "right") +
NoLegend()
plot4 <- SpatialFeaturePlot(breast_cancer, features = "nCount_Spatial") +
theme(legend.position = "right")
plot5 <- SpatialFeaturePlot(breast_cancer, features = "nFeature_Spatial") +
theme(legend.position = "right")
plot6 <- SpatialFeaturePlot(breast_cancer, features = "pct.mito") +
theme(legend.position = "right")
plot1 + plot2 + plot3 + plot4 + plot5 + plot6 + plot_layout(nrow = 2, ncol = 3)
Spaceranger does UMI normalization for clustering and differential expression but does not return that normalized matrix.
Let’s have a look at pre-normalization raw UMI counts. Feel free to change these genes or add genes.
SpatialFeaturePlot(breast_cancer, features = c("ERBB2", "CD8A", "MT-ND1"))
Don’t worry about “reached iteration limit” warnings. See https://github.com/ChristophH/sctransform/issues/25 for discussion.
Default assay will now be set to SCT
breast_cancer <- SCTransform(breast_cancer, assay = "Spatial", verbose = FALSE)
Now let’s have a look at SCT normalized UMI counts for these same genes. The default assay is now “SCT”.
SpatialFeaturePlot(breast_cancer, features = c("ERBB2", "CD8A"))
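To see why raw UMI counts are hard to compare across spots, here is a toy base R sketch of the simplest kind of depth normalization: scale every spot to a common total and log-transform. This is deliberately not what SCTransform does (SCTransform fits a regularized negative binomial regression per gene); the matrix and the scale_factor value are assumptions chosen only for illustration.

```r
# Two spots with identical composition but a 10-fold difference in depth
counts <- matrix(
  c(10, 100,
    30, 300,
    60, 600),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("shallow_spot", "deep_spot"))
)

scale_factor <- 10000          # arbitrary common total per spot
depth <- colSums(counts)       # total UMIs per spot

# Divide each column by its depth, rescale, then log(1 + x)
norm <- log1p(sweep(counts, 2, depth, "/") * scale_factor)
print(round(norm, 2))
```

After normalization the two columns are identical, which is the point: the raw difference between them was sequencing depth, not biology.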
From Seurat:
The default parameters in Seurat emphasize the visualization of molecular data. However, you can also adjust the size of the spots (and their transparency) to improve the visualization of the histology image, by changing the following parameters:
p1 <- SpatialFeaturePlot(breast_cancer, features = "IGFBP5", pt.size.factor = 1)+
theme(legend.position = "right") +
ggtitle("Actual Spot Size")
p2 <- SpatialFeaturePlot(breast_cancer, features = "IGFBP5")+
theme(legend.position = "right") +
ggtitle("Scaled Spot Size")
p1 + p2
We can then proceed to run dimensionality reduction and clustering on the RNA expression data, using the same workflow as we use for scRNA-seq analysis.
Some of these processes can be parallelized; please see Parallelization in Seurat for more info.
The default UMAP calculation is performed with the R-based UWOT library. However, you can run UMAP in Python via the reticulate library and umap-learn. We have found that for smaller data sets (<= 10k cells/spots) UWOT is great. For much larger data sets (100k+ cells/spots) umap-learn can be a faster option.
First, let’s run our PCA. How many PCs should we use going forward?
breast_cancer <- RunPCA(breast_cancer, assay = "SCT", verbose = FALSE)
breast_cancer <- FindVariableFeatures(breast_cancer)
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
ElbowPlot(breast_cancer, ndims = 40)
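ElbowPlot shows the standard deviation of each principal component, and we look for the point where the curve flattens out. The base R sketch below reproduces that idea on simulated data with prcomp. The simulated matrix (three strong signals plus noise) and the 90% cumulative-variance cutoff are assumptions for illustration, not a recommendation for the breast cancer dataset.

```r
set.seed(1)
n_spots <- 200
n_features <- 20

# Three strong latent signals spread over 20 features, plus unit noise
loadings <- matrix(rnorm(3 * n_features, sd = 3), nrow = 3)
signal <- matrix(rnorm(n_spots * 3), ncol = 3) %*% loadings
x <- signal + matrix(rnorm(n_spots * n_features), ncol = n_features)

pca <- prcomp(x, center = TRUE, scale. = FALSE)

# The quantities an elbow plot is built from
sdev <- pca$sdev
var_explained <- sdev^2 / sum(sdev^2)
cum_var <- cumsum(var_explained)

# First PC at which the cumulative variance passes the (arbitrary) cutoff
n_pcs <- which(cum_var >= 0.9)[1]
print(data.frame(PC = 1:6, sdev = round(sdev[1:6], 2),
                 cum_var = round(cum_var[1:6], 3)))
print(n_pcs)
```

On real data the drop is rarely this clean, which is why it is worth rerunning the downstream steps with a different number of dimensions and comparing.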
Now let’s cluster and project to UMAP. Do 30 PCs look okay? What if we changed the number of dimensions to 20?
breast_cancer <- FindNeighbors(breast_cancer, reduction = "pca", dims = 1:30)
Computing nearest neighbor graph
Computing SNN
breast_cancer <- FindClusters(breast_cancer, verbose = FALSE)
breast_cancer <- RunUMAP(breast_cancer, reduction = "pca", dims = 1:30)
11:29:34 UMAP embedding parameters a = 0.9922 b = 1.112
11:29:34 Read 3822 rows and found 30 numeric columns
11:29:34 Using Annoy for neighbor search, n_neighbors = 30
11:29:34 Building Annoy index with metric = cosine, n_trees = 50
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
11:29:35 Writing NN index file to temp file /tmp/RtmpRqpjKF/file4d0f925f1c450
11:29:35 Searching Annoy index using 1 thread, search_k = 3000
11:29:36 Annoy recall = 100%
11:29:37 Commencing smooth kNN distance calibration using 1 thread
11:29:37 Initializing from normalized Laplacian + noise
11:29:41 Commencing optimization for 500 epochs, with 157848 positive edges
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
11:29:50 Optimization finished
Now let’s have a look at the clustering in UMAP space
p1 <- DimPlot(breast_cancer, reduction = "umap", label = FALSE) +
labs(color = "Cluster")
p2 <- FeaturePlot(breast_cancer, features = c('nFeature_Spatial','pct.mito'), dims = 1:2)
p1 / p2
Here’s the clustering in UMAP and image space
p1 <- DimPlot(breast_cancer, reduction = "umap", label = TRUE) +
labs(color = "Cluster")
p2 <- SpatialDimPlot(breast_cancer, label = TRUE, label.size = 3) +
labs(fill = "Cluster")
p1 + p2 + plot_annotation(
title = 'Clustering in UMAP and Tissue Space',
caption = 'Processed by Spaceranger 1.1\nNormalization and Clustering by Seurat'
) + plot_layout(nrow = 1)
I don’t really like these colors so let’s change them manually
p1 <- DimPlot(breast_cancer, reduction = "umap", label = TRUE) +
labs(color = "Cluster") +
scale_color_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green"))
p2 <- SpatialDimPlot(breast_cancer, label = TRUE, label.size = 3) +
labs(fill = "Cluster")+
scale_fill_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green"))
p1 + p2 + plot_annotation(
title = 'Clustering in UMAP and Tissue Space',
caption = 'Processed by Spaceranger 1.1\nNormalization and Clustering by Seurat'
) + plot_layout(nrow = 1)
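The same 15-colour vector is typed out in several of the plots in this workshop. One option, sketched here in base R, is to store it once as a named vector keyed by cluster label. The name cluster_cols and the assumption that the clusters are labelled "0" to "14" are hypothetical; check the actual cluster labels in your object before relying on the names.

```r
# The manual palette used in this workshop, stored once
cluster_cols <- c("#b2df8a", "#e41a1c", "#377eb8", "#4daf4a", "#ff7f00", "gold",
                  "#a65628", "#999999", "black", "pink", "purple", "brown",
                  "grey", "yellow", "green")

# Assumed cluster labels "0".."14" (hypothetical; adjust to your clustering)
names(cluster_cols) <- as.character(0:14)

# col2rgb() errors on anything that is not a valid colour, so this is a cheap check
rgb_values <- col2rgb(cluster_cols)
print(cluster_cols[c("0", "1", "2")])

# The vector could then be passed to the scales used above, for example:
# scale_color_manual(values = cluster_cols) or scale_fill_manual(values = cluster_cols)
```

With a named vector, ggplot2 matches colours to clusters by label rather than by position, so a cluster keeps its colour even if another cluster is dropped from a plot.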
If you are interested, you can now also look at UMI and gene counts per cluster.
plot1 <- VlnPlot(breast_cancer, features = "nCount_Spatial", pt.size = 0.1) +
ggtitle("UMI") +
scale_fill_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green"))+
theme(axis.text.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "right") +
NoLegend()
plot2 <- VlnPlot(breast_cancer, features = "nFeature_Spatial", pt.size = 0.1) +
ggtitle("Genes") +
scale_fill_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green"))+
theme(axis.title.x = element_blank(),
legend.position = "right") +
NoLegend()
plot1 + plot2
We can also look at some of our QC information by cluster now that we’ve processed the data
plot1 <- VlnPlot(breast_cancer, features = "nCount_Spatial", pt.size = 0.1) +
ggtitle("UMI") +
theme(axis.text.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "right") +
NoLegend()
plot2 <- VlnPlot(breast_cancer, features = "nFeature_Spatial", pt.size = 0.1) +
ggtitle("Genes") +
theme(axis.text.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "right") +
NoLegend()
plot3 <- VlnPlot(breast_cancer, features = "pct.mito", pt.size = 0.1) +
ggtitle("Percentage Mito genes") +
theme(axis.text.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "right") +
NoLegend()
plot4 <- SpatialFeaturePlot(breast_cancer, features = "nCount_Spatial") +
theme(legend.position = "right")
plot5 <- SpatialFeaturePlot(breast_cancer, features = "nFeature_Spatial") +
theme(legend.position = "right")
plot6 <- SpatialFeaturePlot(breast_cancer, features = "pct.mito") +
theme(legend.position = "right")
plot1 + plot2 + plot3 + plot4 + plot5 + plot6 + plot_layout(nrow = 2, ncol = 3)
Now let’s take a look at a gene of interest with violin plots and also in image space. The red diamonds (shape = 23) represent the mean expression of each cluster.
p1 <- VlnPlot(breast_cancer, features = "IGFBP5", pt.size = 0.1) +
ggtitle("IGFBP5") +
scale_fill_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green"))+
theme(axis.title.x = element_blank(),
legend.position = "right") +
NoLegend() +
stat_summary(fun=mean, geom="point", shape=23, size=4, color="red")
p2 <- SpatialFeaturePlot(breast_cancer, features = "IGFBP5")+
theme(legend.position = "right")
p3 <- SpatialDimPlot(breast_cancer, label = TRUE, label.size = 3) +
labs(fill = "Cluster")+
scale_fill_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green")) +
NoLegend()
row1 <- p2 + p3 + plot_layout(nrow = 1)
row1 + p1+ plot_layout(nrow = 2, widths = c(0.5, 0.5))
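The stat_summary(fun = mean, ...) layer in the violin plot computes one mean per group on the x axis, which here is one mean per cluster. A minimal base R sketch of the same summary, using invented expression values and cluster labels:

```r
# Invented expression values for seven spots in three clusters
expr <- c(0.2, 0.4, 0.6, 2.0, 2.4, 1.0, 1.0)
cluster <- factor(c("0", "0", "0", "1", "1", "2", "2"))

# One mean per cluster, the value each summary point is drawn at
cluster_means <- tapply(expr, cluster, mean)
print(cluster_means)
```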
We can also look at these data interactively. This function can be a little slow but also very useful to visualize expression in different projection spaces. We won’t run this today
LinkedDimPlot(breast_cancer)
First we’ll identify differentially expressed genes. Let’s find all the markers for every cluster. We’ve already precalculated these for you, so let’s just load them up.
de_markers <- readRDS(file = "/mnt/libs/shared_data/de_markers.rds")
de_markers %>%
group_by(cluster) %>%
top_n(n = 2, wt = avg_logFC)
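If the dplyr pipeline is unfamiliar, the following base R sketch does roughly the same job on a toy marker table: split by cluster, order by avg_logFC, and keep the top two rows of each group. The data frame is invented; only the column names mirror the ones used above. Unlike top_n, this sketch sorts within each cluster and does not keep extra rows when there are ties.

```r
# Toy marker table (values invented for illustration)
markers <- data.frame(
  cluster   = c("0", "0", "0", "1", "1", "1"),
  gene      = c("g1", "g2", "g3", "g4", "g5", "g6"),
  avg_logFC = c(0.5, 2.0, 1.2, 0.3, 0.9, 1.5),
  stringsAsFactors = FALSE
)

# Split by cluster, sort each piece by decreasing avg_logFC, keep the first two rows
top2 <- do.call(rbind, lapply(split(markers, markers$cluster), function(df) {
  head(df[order(df$avg_logFC, decreasing = TRUE), ], 2)
}))
rownames(top2) <- NULL
print(top2)
```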
Originally this was processed with
de_markers <- FindAllMarkers(breast_cancer, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
de_markers_up <- de_markers %>%
arrange(-avg_logFC)
de_markers_down <- de_markers %>%
arrange(avg_logFC)
de_markers_up
de_markers_down
SpatialFeaturePlot(object = breast_cancer, features = de_markers_up$gene[1:13], alpha = c(0.1, 1), ncol = 3)
SpatialFeaturePlot(object = breast_cancer, features = de_markers_down$gene[1:13], alpha = c(0.1, 1), ncol = 3)
What are the top variable features?
VariableFeatures(breast_cancer)[1:10]
[1] "IGLC2" "ALB" "IGHG3" "IGHM" "IGHA1" "IGKC" "IGHG1" "IGLC3" "IGHG2" "CPB1"
What are the top DE genes?
rownames(de_markers)[1:10]
[1] "IGHG1" "IGKC" "IGHG3" "IGLC2" "IGHG4" "C3" "CYBA" "IGLC3" "SFRP2" "TIMP1"
So what about spatial enrichment? This can be a very informative analysis tool that takes into account the spatial relationship of each gene.
Some methods for these approaches are:
Using the top 100 variable genes, find the spatially enriched ones. Note that the Seurat Spatial Tutorial uses 1000 genes, which can take a long time. You can also use all genes, but that will also take a long time. Calculating Moran’s I can sometimes be a faster approach, especially if you are using parallelization. Here we’ll do both.
While this process is running, it is a good time to take a break for a couple of minutes, catch up, or ask questions.
breast_cancer <- FindSpatiallyVariableFeatures(breast_cancer,
assay = "SCT",
slot = "scale.data",
features = VariableFeatures(breast_cancer)[1:100],
selection.method = "markvariogram", verbose = TRUE)
Have a look at the spatially variable genes calculated by markvariogram ordered from most variable to least variable
SpatiallyVariableFeatures(breast_cancer, selection.method = "markvariogram", decreasing = TRUE)
[1] "CRISP3" "CXCL14" "MGP" "CPB1" "SLITRK6" "TTLL12" "MALAT1" "AGR2" "ALB"
[10] "GFRA1" "S100G" "CSTA" "DEGS1" "TFF3" "IGLC2" "IGHG3" "C6orf141" "TFF1"
[19] "IGKC" "HEBP1" "IGHG1" "APOE" "ZNF350-AS1" "AC087379.2" "IGHG4" "C3" "FCGR3B"
[28] "TIMP1" "LINC00645" "IGHM" "SCGB2A2" "KRT14" "IGLC3" "KRT17" "LYZ" "APOC1"
[37] "SCGB1D2" "STC2" "IGHA1" "C1QA" "AEBP1" "APOD" "KRT5" "PGM5-AS1" "MMP7"
[46] "CCL19" "COL6A2" "TAGLN" "BGN" "S100A9" "IGHG2" "COL1A2" "DCN" "SPP1"
[55] "COL1A1" "CGA" "VIM" "IGFBP7" "FN1" "CCDC80" "CXCL9" "IGHA2" "TRBC2"
[64] "SFRP2" "CD52" "KRT6B" "S100A2" "LUM" "COL3A1" "IGLC7" "SAA1" "CARTPT"
[73] "COMP" "S100A8" "JCHAIN" "CST1" "PTGDS" "SFRP4" "CD79A" "CCL21" "FABP4"
[82] "MUC19" "ACKR1" "POSTN" "MMP9" "S100A7" "VWF" "AQP1" "CTGF" "A2M"
[91] "SPARCL1" "ACTA2" "MS4A1" "IGLL5" "MYH11" "CXCL10" "IGHD" "HBB" "MMP1"
[100] "TPSB2"
top.features_trendseq <- head(SpatiallyVariableFeatures(breast_cancer, selection.method = "markvariogram"), 8)
SpatialFeaturePlot(breast_cancer, features = top.features_trendseq, ncol = 4, alpha = c(0.1, 1))
Moran’s I implementation. For other spatial data types, x.cuts and y.cuts determine the grid that is laid over the tissue in the capture area. Here we’ll remove those.
breast_cancer <- FindSpatiallyVariableFeatures(breast_cancer,
assay = "SCT",
slot = "scale.data",
features = VariableFeatures(breast_cancer)[1:100],
selection.method = "moransi")
Computing Moran's I
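To build some intuition for what Moran’s I measures, here is a small base R sketch that computes it by hand on a 4 x 4 grid. The helper name morans_i, the toy grid, and the choice of inverse squared distance weights are assumptions of this sketch; Seurat’s own implementation may differ in its weighting and in how it handles the Visium spot layout.

```r
# Moran's I: (N / sum of weights) * weighted cross-products / total variance
morans_i <- function(values, coords) {
  d <- as.matrix(dist(coords))   # pairwise distances between positions
  w <- 1 / d^2                   # inverse squared distance weights (assumption)
  diag(w) <- 0                   # a position is not its own neighbour
  z <- values - mean(values)     # centred values
  n <- length(values)
  (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

# Toy 4 x 4 grid of positions
coords <- expand.grid(x = 1:4, y = 1:4)

# A smooth left-to-right gradient: neighbours look alike
gradient <- coords$x
# A checkerboard: direct neighbours always differ
checkerboard <- (coords$x + coords$y) %% 2

i_gradient <- morans_i(gradient, coords)
i_checker <- morans_i(checkerboard, coords)
print(c(gradient = i_gradient, checkerboard = i_checker))
```

The gradient gives a positive value (similar values sit next to each other) and the checkerboard gives a negative one (neighbours are dissimilar), which is the behaviour we rely on when ranking genes by spatial structure.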
Have a look at the spatially variable genes calculated by moransi ordered from most variable to least variable
SpatiallyVariableFeatures(breast_cancer, selection.method = "moransi", decreasing = TRUE)
[1] "CRISP3" "CXCL14" "TTLL12" "SLITRK6" "GFRA1" "AGR2" "MGP" "ALB" "MALAT1"
[10] "CPB1" "DEGS1" "C6orf141" "CSTA" "TFF3" "LINC00645" "FCGR3B" "S100G" "TFF1"
[19] "HEBP1" "C3" "ZNF350-AS1" "SCGB1D2" "APOD" "IGLC2" "TIMP1" "IGHG3" "SCGB2A2"
[28] "APOC1" "KRT14" "LYZ" "CCDC80" "APOE" "TAGLN" "IGHG1" "AC087379.2" "CCL19"
[37] "SPP1" "PGM5-AS1" "IGKC" "S100A9" "KRT17" "STC2" "IGHM" "FN1" "KRT5"
[46] "IGFBP7" "C1QA" "COL6A2" "AQP1" "BGN" "CXCL9" "ACKR1" "CARTPT" "DCN"
[55] "AEBP1" "IGLC3" "IGHG4" "VIM" "S100A2" "IGHG2" "MMP7" "IGHA1" "CCL21"
[64] "KRT6B" "TRBC2" "COL1A2" "CGA" "SFRP2" "VWF" "COL1A1" "SAA1" "SPARCL1"
[73] "CD52" "MUC19" "COL3A1" "SFRP4" "S100A8" "POSTN" "PTGDS" "IGHA2" "JCHAIN"
[82] "CD79A" "ACTA2" "CTGF" "LUM" "CST1" "A2M" "S100A7" "COMP" "MS4A1"
[91] "IGLC7" "MMP9" "HBB" "FABP4" "MYH11" "MMP1" "IGLL5" "CXCL10" "TPSB2"
[100] "IGHD"
top.features_moransi <- head(SpatiallyVariableFeatures(breast_cancer, selection.method = "moransi"), 8)
SpatialFeaturePlot(breast_cancer, features = top.features_moransi, ncol = 4, alpha = c(0.1, 1))
We can see that the results are slightly different. So let’s take a look at what those differences are.
spatially_variable_genes <- breast_cancer@assays$SCT@meta.features %>%
tidyr::drop_na()
spatially_variable_genes
Let’s compare the ranks that the two methods assign to each gene.
mm_cor <- cor.test(spatially_variable_genes$moransi.spatially.variable.rank, spatially_variable_genes$markvariogram.spatially.variable.rank)
ggplot(spatially_variable_genes, aes(x=moransi.spatially.variable.rank,y=markvariogram.spatially.variable.rank))+
geom_point()+
geom_smooth()+
xlab("Morans I Rank")+
ylab("Markvariogram Rank")+
annotate("text", x = 25, y = 75, label = paste("Pearson's Correlation\n", round(mm_cor$estimate[1], digits = 2), sep = ""))+
theme_bw()
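Because both columns hold ranks, the Pearson correlation computed above is the same quantity as a Spearman correlation on the statistics the ranks were derived from. The base R sketch below checks this on simulated scores; the scores and the noise level are invented, and it assumes ranks are assigned from the highest score down.

```r
set.seed(42)

# Two simulated per-gene scores that mostly agree
score_a <- rnorm(100)
score_b <- score_a + rnorm(100, sd = 0.5)

# Rank 1 = highest score, as in a "most variable first" ordering
rank_a <- rank(-score_a)
rank_b <- rank(-score_b)

pearson_on_ranks <- cor(rank_a, rank_b, method = "pearson")
spearman_on_scores <- cor(score_a, score_b, method = "spearman")
print(c(pearson_on_ranks = pearson_on_ranks,
        spearman_on_scores = spearman_on_scores))
```

So the number annotated on the plot is best read as rank agreement between the two methods, not as a statement about the size of the underlying statistics.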
We can identify these outliers interactively using ggplotly
plotly::ggplotly(
ggplot(spatially_variable_genes, aes(x=moransi.spatially.variable.rank,y=markvariogram.spatially.variable.rank, label =row.names(spatially_variable_genes)))+
geom_point()+
geom_smooth()+
xlab("Morans I Rank")+
ylab("Markvariogram Rank")+
annotate("text", x = 25, y = 75, label = paste("Pearson's Correlation\n", round(mm_cor$estimate[1], digits = 2), sep = ""))+
theme_bw()
)
Where are these genes being expressed relative to pathologist annotation?
Looks like the Matrix Gla protein (MGP) gene is enriched in Ductal Carcinoma In Situ. Not a lot is known about MGP in the context of cancer but it looks like it could be an interesting novel gene to investigate with regard to Ductal Carcinoma In Situ.
ca <- readbitmap::read.bitmap("/mnt/home/stephen.williams/yard/Odin/SIB_2020_Workshop/images/Breast Cancer Path.png")
# in the tutorial
# ca <- readbitmap::read.bitmap('/mnt/libs/shared_data/human_breast_cancer_1/images/Breast_Cancer_Path.png')
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(ca,0,0,1,1)
Here we have a preprocessed Seurat object with 10k nuclei annotated from a breast cancer sample. Don’t worry too much about the details of how this data was generated; they don’t particularly matter for our purposes.
bc_snRNA <- readRDS("/mnt/libs/shared_data/bc_snRNA.rds")
bc_snRNA
It’s always a good idea to rerun normalization to make sure your data is in the correct format before moving forward with integration. We’ve already preprocessed this dataset.
bc_snRNA <- SCTransform(bc_snRNA, ncells = 3000, verbose = FALSE) %>%
RunPCA(verbose = FALSE) %>%
RunUMAP(dims = 1:30)
snRNA Class
bc_snRNA
An object of class Seurat
56988 features across 10000 samples within 2 assays
Active assay: SCT (23450 features)
1 other assay present: RNA
3 dimensional reductions calculated: pca, harmony, umap
Subclass
DimPlot(bc_snRNA, group.by = "ident", label = FALSE) +
scale_color_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green", "darkgreen"))
What genes define some cell types?
# Find markers for every cluster compared to all remaining cells, report only the positive ones
# This takes a bit of time so we'll skip it and move on to specific cell types
de_markers_snRNA <- FindAllMarkers(bc_snRNA, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
de_markers_snRNA %>%
group_by(cluster) %>%
top_n(n = 2, wt = avg_logFC)
Notice that in the calls below we are using test.use = "roc", an AUC classifier that gives us an idea of how well any given gene defines a cell type.
Find markers that define Tumor cells
de_markers_tumor <- FindMarkers(bc_snRNA, ident.1 = "Likely tumor cells", logfc.threshold = 0.25, test.use = "roc", only.pos = TRUE, verbose = FALSE)
de_markers_tumor %>%
tibble::rownames_to_column("gene") %>%
arrange(-power)
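The power column we sort by comes from the ROC test. According to the Seurat documentation for FindMarkers, it is abs(AUC - 0.5) * 2, so 0 means the gene carries no information about group membership and 1 means it separates the groups perfectly. Here is a base R sketch of that calculation for a single gene, using invented expression values and the rank-sum identity for the AUC:

```r
# Invented expression of one gene in the cell type of interest vs. all other cells
in_group <- c(5, 6, 7, 8)
out_group <- c(1, 2, 3, 6)

n_in <- length(in_group)
n_out <- length(out_group)

# AUC via the Mann-Whitney U statistic (ties receive average ranks)
ranks <- rank(c(in_group, out_group))
u_stat <- sum(ranks[seq_len(n_in)]) - n_in * (n_in + 1) / 2
auc <- u_stat / (n_in * n_out)

# Rescale so that 0 = uninformative and 1 = perfect separation
power <- abs(auc - 0.5) * 2
print(c(auc = auc, power = power))
```

The AUC here is the probability that a randomly chosen cell from the group of interest has higher expression than a randomly chosen cell from the rest, counting ties as one half.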
Find markers that define T cells
de_markers_tcell <- FindMarkers(bc_snRNA, ident.1 = "T cells", logfc.threshold = 0.25, test.use = "roc", only.pos = TRUE, verbose = FALSE)
de_markers_tcell %>%
tibble::rownames_to_column("gene") %>%
arrange(-power)
Find markers that define stem cells
(FeaturePlot(bc_snRNA, features = "SKAP1")|
DimPlot(bc_snRNA, group.by = "ident", label = FALSE)+
scale_color_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green", "darkgreen")))
de_markers_stemcell <- FindMarkers(bc_snRNA, ident.1 = "CD49f-hi MaSCs", logfc.threshold = 0.25, test.use = "roc", only.pos = TRUE, verbose = FALSE)
(FeaturePlot(bc_snRNA, features = "IFNG-AS1")|
DimPlot(bc_snRNA, group.by = "ident", label = FALSE)+
scale_color_manual(values = c("#b2df8a","#e41a1c","#377eb8","#4daf4a","#ff7f00","gold",
"#a65628", "#999999", "black", "pink", "purple", "brown",
"grey", "yellow", "green", "darkgreen")))
Let’s have a look at our annotations again.
anchors <- FindTransferAnchors(reference = bc_snRNA, query = breast_cancer, normalization.method = "SCT")
Performing PCA on the provided reference using 2442 features as input.
Projecting PCA
Finding neighborhoods
Finding anchors
Found 3276 anchors
Filtering anchors
Retained 1572 anchors
Extracting within-dataset neighbors
predictions.assay <- TransferData(anchorset = anchors, refdata = bc_snRNA$subclass, prediction.assay = TRUE,
weight.reduction = breast_cancer[["pca"]])
Finding integration vectors
Finding integration vector weights
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Predicting cell labels
breast_cancer[["predictions"]] <- predictions.assay
DefaultAssay(breast_cancer) <- "predictions"
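The predictions assay holds one score per reference label for every spot, and the spatial plots further down colour spots by one label’s score at a time. As a rough guide to reading those scores, here is a base R sketch on an invented labels-by-spots matrix. The score values and the 0.5 cutoff are assumptions for illustration only; the label names are borrowed from the features plotted later.

```r
# Invented prediction scores: rows are reference labels, columns are spots
scores <- matrix(
  c(0.7, 0.1, 0.4,
    0.2, 0.8, 0.3,
    0.1, 0.1, 0.3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("T cells-0", "B cells", "Ductal cells"),
                  c("spot1", "spot2", "spot3"))
)

# Best-scoring label and its score for each spot
best_label <- rownames(scores)[apply(scores, 2, which.max)]
best_score <- apply(scores, 2, max)

# Flag spots whose best score clears an arbitrary confidence cutoff
confident <- best_score >= 0.5
print(data.frame(spot = colnames(scores), best_label, best_score, confident))
```

A spot like spot3, where no label scores much higher than the others, is the kind of case where we should be cautious about any cell type assertion.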
NOTE we’ll work with subclass
What about the immune microenvironment?
Ductal Carcinoma in situ is depleted of T-cells
ca <- readbitmap::read.bitmap("/mnt/home/stephen.williams/yard/Odin/SIB_2020_Workshop/images/Breast Cancer Path.png")
# in the tutorial
# ca <- readbitmap::read.bitmap('/mnt/libs/shared_data/human_breast_cancer_1/images/Breast_Cancer_Path.png')
plot(0:1,0:1,type="n",ann=FALSE,axes=FALSE)
rasterImage(ca,0,0,1,1)
B cells are enriched in the fibrous tissue outside the tumor
p1 <- SpatialFeaturePlot(breast_cancer,
features = c("T cells-0"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
p2 <- SpatialFeaturePlot(breast_cancer,
features = c("T cells-1"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
p3 <- SpatialFeaturePlot(breast_cancer,
features = c("T cells-2"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
p4 <- SpatialFeaturePlot(breast_cancer,
features = c("T cells-5"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
p1 + p2 + p3 + p4 + plot_layout(nrow = 2)
There seem to be some ductal cells but have a look at our score. Are we confident in this assertion?
SpatialFeaturePlot(breast_cancer,
features = c("B cells"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
It looks like the ductal carcinoma in situ is enriched for tumor subtypes 8, 10, and 12 but not 3.
SpatialFeaturePlot(breast_cancer,
features = c("Ductal cells"),
pt.size.factor = 1.5, crop = TRUE)
Like the Ductal Cells, we might not be as confident in the Tumor Stem Cells but this might make sense considering the 10x snRNA dataset and the Visium dataset are from different individuals.
# Each tumor subtype gets its own variable; reusing p1 would overwrite the
# earlier plots and leave p2, p3, and p4 pointing at the T cell plots
p1 <- SpatialFeaturePlot(breast_cancer,
features = c("Tumor cells-3"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
p2 <- SpatialFeaturePlot(breast_cancer,
features = c("Tumor cells-8"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
p3 <- SpatialFeaturePlot(breast_cancer,
features = c("Tumor cells-10"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
p4 <- SpatialFeaturePlot(breast_cancer,
features = c("Tumor cells-12"),
pt.size.factor = 1.5, ncol = 2, crop = TRUE)
cowplot::plot_grid(p1, p2, p3, p4)
10x Genomics, patrick.roelli@10xgenomics.com
10x Genomics, stefania.giacomello@10xgenomics.com
10x Genomics, stephen.williams@10xgenomics.com